Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Package confusion attacks such as typosquatting threaten soft- ware supply chains. Attackers make packages with names that syntactically or semantically resemble legitimate ones, trick- ing engineers into installing malware. While prior work has developed defenses against package confusions in some soft- ware package registries, notably NPM, PyPI, and RubyGems, gaps remain: high false-positive rates, generalization to more software package ecosystems, and insights from real-world deployment. In this work, we introduce ConfuGuard, a state-of-art de- tector for package confusion threats. We begin by presenting the first empirical analysis of benign signals derived from prior package confusion data, uncovering their threat patterns, engineering practices, and measurable attributes. Advancing existing detectors, we leverage package metadata to distin- guish benign packages, and extend support from three up to seven software package registries. Our approach significantly reduces false positive rates (from 80% to 28%), at the cost of an additional 14s average latency to filter out benign pack- ages by analyzing the package metadata. ConfuGuard is used in production at our industry partner, whose analysts have already confirmed 630 real attacks detected by ConfuGuardmore » « less
-
Abstract As innovation in deep learning continues, many engineers are incorporating Pre-Trained Models (PTMs) as components in computer systems. Some PTMs are foundation models, and others are fine-tuned variations adapted to different needs. When these PTMs are named well, it facilitates model discovery and reuse. However, prior research has shown that model names are not always well chosen and can sometimes be inaccurate and misleading. The naming practices for PTM packages have not been systematically studied, which hampers engineers’ ability to efficiently search for and reliably reuse these models. In this paper, we conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry. We begin by reporting on a survey of 108 Hugging Face users, highlighting differences from traditional software package naming and presenting findings on PTM naming practices. The survey results indicate a mismatch between engineers’ preferences and current practices in PTM naming. We then introduce DARA, the first automatedDNNARchitectureAssessment technique designed to detect PTM naming inconsistencies. Our results demonstrate that architectural information alone is sufficient to detect these inconsistencies, achieving an accuracy of 94% in identifying model types and promising performance (over 70%) in other architectural metadata as well. We also highlight potential use cases for automated naming tools, such as model validation, PTM metadata generation and verification, and plagiarism detection. Our study provides a foundation for automating naming inconsistency detection. Finally, we envision future work focusing on automated tools for standardizing package naming, improving model selection and reuse, and strengthening the security of the PTM supply chain.“The main idea is to treat a program as a piece of literature, addressed to human beings rather than to a computer”—D. Knuthmore » « less
-
Regular Expression Denial of Service (ReDoS) is a vulnerability class that has become prominent in recent years. Attackers can weaponize such weaknesses as part of asymmetric cyberattacks that exploit the slow worst-case matching time of regular expres- sion (regex) engines. In the past, problematic regular expressions have led to outages at Cloudflare and Stack Overflow, showing the severity of the problem. While ReDoS has drawn significant research attention, there has been no systematization of knowledge to delineate the state of the art and identify opportunities for fur- ther research. In this paper, we describe the existing knowledge on ReDoS. We first provide a systematic literature review, discussing approaches for detecting, preventing, and mitigating ReDoS vul- nerabilities. Then, our engineering review surveys the latest regex engines to examine whether and how ReDoS defenses have been re- alized. Combining our findings, we observe that (1) in the literature, almost no studies evaluate whether and how ReDoS vulnerabilities can be weaponized against real systems, making it difficult to assess their real-world impact; and (2) from an engineering view, many mainstream regex engines now have ReDoS defenses, rendering many threat models obsolete. We conclude with an extensive dis- cussion, highlighting avenues for future work. The open challenges in ReDoS research are to evaluate emerging defenses and support engineers in migrating to defended engines. We also highlight the parallel between performance bugs and asymmetric DoS, and we argue that future work should capitalize more on this similarity and adopt a more systematic view on ReDoS-like vulnerabilities.more » « less
-
This paper investigates how to efficiently deploy vision transformers on edge devices for small workloads. Recent methods reduce the latency of transformer neural networks by removing or merging tokens, with small accuracy degradation. However, these methods are not designed with edge device deployment in mind: they do not leverage information about the latency-workload trends to improve efficiency. We address this shortcoming in our work. First, we identify factors that affect ViT latency-workload relationships. Second, we determine token pruning schedule by leveraging non-linear latency-workload relationships. Third, we demonstrate a training-free, token pruning method utilizing this schedule. We show other methods may increase latency by 2-30%, while we reduce latency by 9-26%. For similar latency (within 5.2% or 7ms) across devices we achieve 78.6%-84.5% ImageNet1K accuracy, while the state-of-the-art, Token Merging, achieves 45.8%-85.4%.more » « less
-
Beginner musicians often struggle to identify specific errors in their performances, such as playing incorrect notes or rhythms. There are two limitations in existing tools for music error detection: (1) Existing approaches rely on automatic alignment; therefore, they are prone to errors caused by small deviations between alignment targets; (2) There is insufficient data to train music error detection models, resulting in over-reliance on heuristics. To address (1), we propose a novel transformer model, Polytune, that takes audio inputs and outputs annotated music scores. This model can be trained end-to-end to implicitly align and compare performance audio with music scores through latent space representations. To address (2), we present a novel data generation technique capable of creating large-scale synthetic music error datasets. Our approach achieves a 64.1% average Error Detection F1 score, improving upon prior work by 40 percentage points across 14 instruments. Additionally, our model can handle multiple instruments compared with existing transcription methods repurposed for music error detection.more » « less
-
We propose Vision Token Turing Machines (ViTTM), an efficient, low-latency, memory-augmented Vision Transformer (ViT). Our approach builds on Neural Turing Machines and Token Turing Machines, which were applied to NLP and sequential visual understanding tasks. ViTTMs are designed for non-sequential computer vision tasks such as image classification and segmentation. Our model creates two sets of tokens: process tokens and memory tokens; process tokens pass through encoder blocks and read-write from memory tokens at each encoder block in the network, allowing them to store and retrieve information from memory. By ensuring that there are fewer process tokens than memory tokens, we are able to reduce the inference time of the network while maintaining its accuracy. On ImageNet-1K, the state-of-the-art ViT-B has median latency of 529.5ms and 81.0% accuracy, while our ViTTM-B is 56% faster (234.1ms), with 2.4 times fewer FLOPs, with an accuracy of 82.9%. On ADE20K semantic segmentation, ViT-B achieves 45.65mIoU at 13.8 frame-per-second (FPS) whereas our ViTTM-B model acheives a 45.17 mIoU with 26.8 FPS (+94%).more » « less
An official website of the United States government

Full Text Available